Reliability and validity of specific quality of life assessment questionnaires related to chronic venous insufficiency: a systematic review

Abstract This systematic review aimed to discuss the main findings regarding the reliability and validity of health-related quality of life questionnaires for chronic venous insufficiency. Searches were performed on the MEDLINE, CINAHL, Web of Science, LILACS, and Scopus databases. The search terms used were related to “venous insufficiency” and “quality of life”. The CIVIQ-20 and CIVIQ-14 instruments had adequate internal consistency and both were able to discriminate disease severity. The VEINES-QoL showed adequate internal consistency but was not able to discriminate disease severity. Most studies did not demonstrate a correlation between VEINES-QoL and the mental component of the SF-36. The AVVQ had inadequate reliability and its validity was also doubtful when compared to the SF-36. The VARIShort demonstrated good internal consistency, reproducibility, and validity, but only the original study was included. For venous leg ulcers, the CCVUQ showed adequate reliability and validity when compared to VLU-QoL.


INTRODUCTION
Chronic venous insufficiency (CVI) is a health condition caused by venous valve incompetence, usually associated with calf pump dysfunction. [1][2][3] The signs and symptoms of CVI span a wide spectrum of severity, ranging from asymptomatic disease to active and recurrent venous leg ulcers. 4 Severity levels can be assessed according to Clinical, Etiological, Anatomical, and Pathophysiological (CEAP) class, stratifying patients by presence of telangiectasias or reticular veins (C1), varicose veins (C2), edema (C3), trophic abnormalities (C4), healed ulcer (C5), and active ulcer (C6). 5 The prevalence of CVI is high: it affects about 25% of the general population. 6 Thus, its treatment generates significant costs for patients and healthcare systems. 7 Furthermore, studies have shown that when patients with CVI are compared to healthy individuals, they have reduced lower limb muscle strength 8 and ankle range of motion, 9 changes in gait and balance, 2 and, consequently, worse health-related quality of life (HRQoL). 10 A previous study recommended use of HRQoL assessment in clinical monitoring and patient management 11 to guarantee analysis of the true impact of diseases on daily life. The HRQoL questionnaires used in several different cardiovascular pathologies and for long-term patient follow-up have emerged as markers of clinical improvement and offer a means for stratification of patient risk. 12 In the setting of CVI, HRQoL questionnaires have been used as a valuable tool to improve decision-making, such as deciding on referral to specialized centers. 13 Many studies have addressed assessment of HRQoL in this population and also the effect of interventions on HRQoL. [14][15][16] However, assessment of HRQoL in patients with CVI is complex, since its clinical expression ranges from esthetic factors to functional components. 17
There is therefore a need to critically discuss the psychometric properties, i.e., reliability, validity, and responsiveness, of the specific questionnaires available to ensure proper use. Reliability and validity stand out among the psychometric properties evaluated by researchers and professionals. 18,19 Reliability refers to an instrument's ability to reproduce consistent results, involving aspects such as coherence, stability, precision, equivalence, and homogeneity. 20 On the other hand, validity is not a characteristic of the instrument, but rather it refers to the instrument's ability to measure exactly what it proposes to measure in a defined population. 21,22 Careful analysis of reliability and validity is therefore useful for routine clinical practice.
Questionnaires must provide valid data for a specific population. For example, a questionnaire may be valid for assessing the HRQoL of patients with peripheral arterial disease, but not for patients with CVI. In addition, measurements must provide scientifically robust and reliable results. It is therefore necessary to determine the reliability and validity of HRQoL questionnaires before they are administered. Data on the aforementioned psychometric properties of HRQoL questionnaires in the context of CVI are scarce and remain unclear. 23 Determining these psychometric properties should assist in the choice of which questionnaires are appropriate for specific target populations. The present study aimed to critically discuss the main findings available in the literature on the reliability and validity of disease-specific HRQoL questionnaires available for patients with CVI.

METHODS
Study design
This study is a systematic review of cross-sectional, cohort, or case-control studies. The protocol was registered on the Open Science Framework (protocol available at https://osf.io/fsuwj/) and written following the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement 24 and the Cochrane recommendations. 25

Search strategy and study selection
Searches were conducted on the Medical Literature Analysis and Retrieval System Online (MEDLINE), the Cumulative Index of Nursing and Allied Health Literature (CINAHL), the Web of Science, Latin American & Caribbean Health Sciences Literature (LILACS), and Scopus, with no language or date restrictions, from database inception to July 2021. Searches were conducted independently by 2 authors (ILGIA and WTS) from April to July 2021. Disagreements were resolved by a third reviewer (HSC). Search terms were related to "venous insufficiency" and "quality of life". After searching the databases, the references retrieved were exported to an Endnote® file and duplicates were removed (duplicates not found by the software were deleted manually). The following strategy was used for the PubMed search: ("venous insufficiency" OR "venous disease" OR "Chronic venous disease") AND ("quality of life" OR "health-related quality of life") and was modified as appropriate for each of the other databases.
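The Boolean structure of the PubMed strategy above can be sketched in a few lines of Python. This is only an illustration of how the two term groups combine; the helper name `or_group` is hypothetical, and the adaptations made for the other databases are not described in the text, so they are not reproduced here.

```python
# Illustrative sketch: only the term lists come from the search strategy above.
# The helper name `or_group` is an assumption made for this example.

def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

condition_terms = ["venous insufficiency", "venous disease", "Chronic venous disease"]
outcome_terms = ["quality of life", "health-related quality of life"]

# The two groups are combined with AND, as in the published strategy.
query = " AND ".join([or_group(condition_terms), or_group(outcome_terms)])
print(query)
```

Keeping the term lists separate from the joining logic makes it straightforward to add synonyms to either group without disturbing the overall AND/OR structure.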

Eligibility criteria
Studies were eligible if they evaluated the reliability and validity of HRQoL questionnaires in patients with CVI, regardless of the degree of severity according to CEAP classification. Studies of patients of both sexes, of any age, from any health institution were considered eligible. Potentially eligible studies were excluded if they: 1) were duplicates, 2) did not assess HRQoL in the CVI population, 3) did not assess the reliability and/or validity of specific HRQoL questionnaires for CVI patients, 4) were review articles, or 5) investigated samples with postthrombotic syndrome.

Quality assessment
The methodological quality of the studies included was verified using the Newcastle-Ottawa Scale adapted for cross-sectional studies, 26 as recommended by the Cochrane Collaboration. The scale was developed by the Universities of Newcastle, Australia, and Ottawa, Canada, and comprises 8 items grouped under 3 domains: selection, comparability and confounders, and outcome. Studies were scored from zero to 10 stars across these 3 domains, with a maximum of five stars for "Selection", two stars for "Comparability", and three stars for "Outcome". The higher the number of stars, the higher the methodological quality of the study. For the risk of bias assessment, studies that scored at least one star in every domain (selection, comparability, and outcome) were classified as high quality. 27 Those that failed to score in at least one of the domains were rated as low quality. All studies that met the eligibility criteria were included in the review, regardless of methodological quality.
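The classification rule described above can be written out as a minimal sketch. The class name `NosScore` and its fields are assumptions made for illustration; only the star maxima and the high/low quality rule are taken from the text.

```python
from dataclasses import dataclass

# Maximum stars per domain, as stated in the quality assessment description.
MAX_STARS = {"selection": 5, "comparability": 2, "outcome": 3}


@dataclass
class NosScore:
    """Hypothetical container for one study's Newcastle-Ottawa stars."""
    selection: int
    comparability: int
    outcome: int

    def __post_init__(self):
        # Reject star counts outside the range allowed for each domain.
        for domain, cap in MAX_STARS.items():
            stars = getattr(self, domain)
            if not 0 <= stars <= cap:
                raise ValueError(f"{domain} must be between 0 and {cap}, got {stars}")

    @property
    def total(self):
        """Total number of stars (0 to 10)."""
        return self.selection + self.comparability + self.outcome

    @property
    def quality(self):
        """'high' only if the study scored in every domain, otherwise 'low'."""
        scored_everywhere = all(getattr(self, domain) > 0 for domain in MAX_STARS)
        return "high" if scored_everywhere else "low"


# A study with 3 stars for selection, none for comparability, and 2 for outcome.
example = NosScore(selection=3, comparability=0, outcome=2)
print(example.total, example.quality)
```

Note that under this rule the classification depends on whether any domain scored zero, not on the total: a study with 5 stars can be low quality while another with 5 stars is high quality.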

Outcomes and data analysis
The following data were extracted from the articles included in this review: authors, year of publication, sample characteristics (sample size, age, CEAP class, percentage of women), HRQoL questionnaires used, psychometric properties (reliability and validity), and methodological quality. The primary outcomes were those related to reliability and validity. If any of these data were missing, the study's corresponding author was contacted.
Internal consistency and inter-rater and intra-rater repeatability were considered as measures of reliability. Internal consistency indicates whether all subparts of an instrument measure the same characteristic and is generally verified using Cronbach's alpha coefficient. 28 Cronbach's alpha coefficient reflects the degree of covariance between items on a scale. Repeatability measures the degree to which similar results are obtained at two different times, i.e., it is the estimate of the stability of the measures. 29 Repeatability is evaluated using intraclass correlation coefficients (ICC) (continuous variables) or the Kappa index (categorical variables). Values equal to or greater than 0.70 were considered adequate for Cronbach's alpha coefficients, ICCs, and the Kappa index. 20,30,31 Validity was evaluated with (1) the correlation between the specific questionnaire score and scores on the SF-36 (criterion validity) or other HRQoL questionnaires (construct validity), and (2) hypothesis testing (construct validity), based on examining the differences in scores between samples with different levels of disease severity. Coefficients above 0.5 for correlations between the specific questionnaire and another standard questionnaire were considered adequate. 32 For hypothesis tests, the questionnaire was classified as adequate when scores were statistically different (p<0.05) between different CEAP classes.
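As a concrete illustration of the internal consistency criterion, the sketch below computes Cronbach's alpha from a small item-by-respondent table using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), and applies the 0.70 threshold used in this review. The response data are invented for the example, and the function names are assumptions, not part of any of the reviewed studies.

```python
from statistics import variance

# Adequacy threshold for Cronbach's alpha, ICC, and Kappa used in this review.
ADEQUATE_RELIABILITY = 0.70


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all respondents' scores for one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


def is_reliable(coefficient):
    """True when a reliability coefficient meets the 0.70 threshold."""
    return coefficient >= ADEQUATE_RELIABILITY


# Invented data: five respondents answering three items on a 1-5 scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3), is_reliable(alpha))
```

Because alpha rises as items covary more strongly relative to their individual variances, a low value for one dimension signals that its items may not be measuring a single shared characteristic.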

RESULTS
The flow of studies through the review is illustrated in Figure 1. The initial search identified 3,359 studies, 2,023 (60.2%) of which were duplicates. After screening titles and abstracts, 1,215 papers were excluded. Most of them were reviews, studies that did not assess HRQoL in patients with CVI, or studies that did not verify the psychometric properties of the questionnaires. A further 94 studies were excluded after reading the full texts, leaving a total of 27 papers included in the present review.
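The counts reported for the study flow can be checked with simple arithmetic. The sketch below only restates the numbers given in the text; the variable names are chosen for this example.

```python
# Counts as reported in the study flow description.
identified = 3359
duplicates = 2023
excluded_on_title_abstract = 1215
excluded_on_full_text = 94

# Records remaining at each stage.
after_deduplication = identified - duplicates
full_texts_assessed = after_deduplication - excluded_on_title_abstract
included = full_texts_assessed - excluded_on_full_text

# Share of duplicates among all records identified, in percent.
duplicate_share = round(100 * duplicates / identified, 1)

print(after_deduplication, full_texts_assessed, included, duplicate_share)
```

The intermediate totals (records screened and full texts assessed) are not stated explicitly in the text, but they follow directly from the reported figures.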

Chronic Lower Limb Venous Insufficiency Questionnaire (CIVIQ)
Five studies that verified the psychometric properties of the CIVIQ-20 (n=6,776) and three that assessed the CIVIQ-14 (n=6,092) were included in this review. The mean quality score for studies assessing the CIVIQ-20 was 5.8 (ranging from 4 to 7); two of these studies were of high quality and three were of low quality. The mean quality score for studies assessing the CIVIQ-14 was 6.3 (ranging from 5 to 8) and all three were of high quality. The sample details, reliability, and validity properties for the studies assessing the CIVIQ-20 and CIVIQ-14 are shown in Table 1.
The CIVIQ-20 is a disease-specific questionnaire developed in French and validated by Launois and colleagues, 35 although an English version is also presented in the manuscript. The questionnaire was designed to verify the HRQoL of patients with CVI and CEAP classes from 0 to 4. The instrument comprises 20 items in 4 dimensions (physical, psychological, and social aspects and pain). The higher the score, the worse the patient's HRQoL. The original version had adequate internal consistency for the psychological, physical, and pain dimensions (Cronbach's α ranging from 0.83 to 0.90), but values for the social dimension were inadequate (Cronbach's α=0.67). Reproducibility was reported using correlations between the scores; the ICC was not calculated. Hypothesis testing demonstrated a higher score in CVI patients with versus without arteritis (p<0.001).
In a multicenter study conducted in 18 countries, 34 the CIVIQ-20 showed adequate internal consistency (Cronbach's α=0.94) and the questionnaire was effective in discriminating levels of CVI severity.
Three cross-cultural adaptations [36][37][38] of the CIVIQ-20 verified the reliability and validity of the questionnaire and, despite including patients with venous leg ulcers, all of them showed adequate internal consistency of the global index (Cronbach's α ranging from 0.93 to 0.94). The Turkish adaptation 37 also found adequate inter-rater repeatability (ICC=0.80) and construct validity for assessment of HRQoL when compared to VEINES-QoL (r= -0.574; p <0.001). Finally, the Dutch version 36 seems to be a valid questionnaire to evaluate the physical aspects of HRQoL in patients with CVI (r= -0.64 when compared to the physical component of the SF-36), but not the mental aspects (r= -0.42 when compared to the mental component of the SF-36).
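Because higher scores on the CIVIQ-20 indicate worse HRQoL while higher scores on the comparison questionnaires indicate better HRQoL, the correlations reported above are negative. The sketch below assumes that the 0.5 adequacy criterion from the Methods is applied to the magnitude of the coefficient, which is consistent with how the values above are interpreted; the function name and dictionary labels are hypothetical.

```python
# Adequacy threshold for correlations with a standard questionnaire (from the Methods).
ADEQUATE_CORRELATION = 0.5


def is_adequate_correlation(r):
    """Assumption: adequacy is judged on the absolute value of r."""
    return abs(r) > ADEQUATE_CORRELATION


# Coefficients reported in the paragraph above.
reported = {
    "Turkish CIVIQ-20 vs VEINES-QoL": -0.574,
    "Dutch CIVIQ-20 vs SF-36 physical component": -0.64,
    "Dutch CIVIQ-20 vs SF-36 mental component": -0.42,
}

for comparison, r in reported.items():
    label = "adequate" if is_adequate_correlation(r) else "inadequate"
    print(f"{comparison}: r={r} -> {label}")
```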
The CIVIQ-14 is a shorter version of the CIVIQ-20 and was described in 2011 by Launois et al., 33 aiming to obtain a more stable questionnaire. This new version comprised 14 items in 3 dimensions (physical aspects, psychological aspects, and pain). The original version, tested in several languages, showed adequate test-retest reliability in all domains (ICC ranging from 0.88 to 0.94). The validity of the questionnaire was not verified by comparison with a standard questionnaire, but by correlations between the CIVIQ-14 score and signs of disease severity (cramps, heavy legs, sensation of swelling, and pain).
Two cross-cultural adaptations were found, 38,39 and both the Serbian and Croatian versions showed adequate internal consistency (Cronbach's α of 0.93 and 0.94, respectively). Hypothesis testing showed that the scores for both cross-cultural adaptations were able to discriminate HRQoL between different CEAP classes (p<0.01 for both).

Venous Insufficiency Epidemiological and Economic Study (VEINES)
The results of the seven studies included that evaluated the reliability and validity of VEINES are shown in Table 2. The mean score for quality was 5.7 (ranging from 3 to 6) and six were of high quality.
The VEINES-QoL/Sym was developed by Lamping and colleagues from the United Kingdom, Belgium, and Canada, 40 and first administered in English, French, Italian, and Canadian French. The instrument consists of 26 items related to symptoms (10 items), limitations in daily activity (9 items), the time of day with the highest intensity of symptoms (1 item), changes in HRQoL over the last year (1 item) and the psychological impact caused by the disease (5 items). Two scores are generated from the questionnaire: the VEINES-QoL, related to the HRQoL, and the VEINES-Sym, related to the presence of symptoms. Twenty-five of the 26 items in the questionnaire are [...] The original version 40 showed adequate internal consistency in all languages. It also showed adequate reproducibility in the languages in which it was evaluated, that is, English and French. Regarding validity, only the questionnaire in the Italian language showed an adequate correlation with both the physical and mental components of the SF-36. In English, French, and Canadian French, the VEINES-QoL had an adequate correlation with the physical component of the SF-36, but an inadequate correlation with its mental component.
The stars indicate the quality scores evaluated with the Newcastle-Ottawa scale for cross-sectional studies, ranging from zero to 10 stars. The higher the number of stars, the higher the methodological quality of the study. The maximum score for the "Selection" item is five stars, the maximum score for the "Comparability" item is two stars, and the maximum score for the "Outcome" item is three stars.
In a multicenter study with CVI patients with venous leg ulcers conducted in England and Northern Ireland, 44 the VEINES-QoL showed adequate internal consistency, reproducibility, and validity according to comparisons with both physical and mental components of the SF-12.
Six cross-cultural adaptations that evaluated the reliability and validity of the VEINES-QoL were found: two in Turkish populations, 41,42 two in Swedish populations, 45,52 one in a Dutch population, 43 and one in a Brazilian 46 population. Internal consistency was adequate in all five versions in which it was evaluated (it was not evaluated in the Brazilian adaptation). Reproducibility was adequate in the two versions in which it was evaluated. 42,52 The results regarding validity were heterogeneous. In general, the score showed an inadequate correlation with the mental component or with the mental health domain of the SF-36, except for in a study by Tuygun et al. 42 In this study, the validity was adequate for both the physical and mental components. Additionally, three studies 43,45,46 found no significant differences between CEAP classes, while just one 41 showed a significant difference between classes.

Aberdeen Varicose Vein Questionnaire (AVVQ)
Five studies were found that investigated the reliability or validity of the AVVQ. The mean score for quality was 6.8 (ranging from 6 to 7) and four articles were of high quality. The results of these studies are shown in Table 3.
The Aberdeen Varicose Vein Questionnaire (AVVQ) is a disease-specific questionnaire that assesses HRQoL in patients with varicose veins. It was developed in Scotland in 1993 by Garratt et al. 51 Briefly, the 13 items evaluate dimensions related to pain, use of analgesics, social issues, and interference caused by varicose veins at work, household chores, and leisure. The higher the score, the worse the patient's HRQoL.
The questionnaire was initially administered to 281 patients with varicose veins and demonstrated adequate, but borderline, internal consistency (Cronbach's α=0.72). Additionally, when compared with the SF-36, the score of the original AVVQ version had inadequate measures of validity (weak to moderate correlations with the SF-36 domains; r values from -0.25 to -0.49). Using the same questionnaire in English in the United Kingdom, Smith et al. 47 also found an adequate and borderline internal consistency (Cronbach's α=0.74). In addition, the questionnaire was not valid for assessment of patients' HRQoL, since the score showed weak correlations with all SF-36 domains (all correlation coefficients were below 0.5).
This questionnaire was culturally adapted for the Dutch and Brazilian populations. [48][49][50] The Dutch adaptation demonstrated adequate (but also borderline) internal consistency, 49 the ability to discriminate different levels of disease severity (CEAP 1 and 2 versus 3 and 4 versus 5 and 6; p<0.01), 49 and weak to moderate correlation with the SF-36 domains, especially those related to physical aspects. 48 In the test-retest analysis, a significant and strong association between two scores (2-week interval) was reported (r=0.87, p<0.01), with no difference between them (p=0.12). 49 The ICC value was not reported.
The Brazilian adaptation 50 showed inadequate internal consistency for the global index (Cronbach's α=0.54) and for the domains varicose extension (Cronbach's α=0.64) and complications (Cronbach's α=0.29). However, the Brazilian version demonstrated adequate intra-rater (ICC = 0.85) and inter-rater (ICC = 0.95) reproducibility and effectiveness for discriminating different levels of severity, although it showed only weak to moderate correlations with SF-36 domains. It is noteworthy that all cross-cultural adaptations included patients in CEAP classes 1 to 6.
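Several test-retest results in this review are reported as a correlation coefficient rather than an ICC. The following sketch, using invented scores, illustrates why the two are not interchangeable: a constant shift between sessions leaves Pearson's r at 1 but lowers an absolute-agreement ICC. The ICC form used here (two-way random effects, absolute agreement, single measure) is an assumption chosen for illustration; the reviewed studies do not all state which form they used.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_absolute_agreement(x, y):
    """Two-way random-effects, absolute-agreement, single-measure ICC for two sessions."""
    n, k = len(x), 2
    scores = list(x) + list(y)
    grand = mean(scores)
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    session_means = [mean(x), mean(y)]
    # Sums of squares from the two-way ANOVA decomposition.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_sessions = n * sum((m - grand) ** 2 for m in session_means)
    ss_total = sum((v - grand) ** 2 for v in scores)
    ss_error = ss_total - ss_subjects - ss_sessions
    ms_subjects = ss_subjects / (n - 1)
    ms_sessions = ss_sessions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_sessions - ms_error) / n
    )


# Invented scores: every patient scores exactly 10 points higher at retest.
test_scores = [10, 20, 30, 40, 50]
retest_scores = [20, 30, 40, 50, 60]

print(round(pearson_r(test_scores, retest_scores), 3))
print(round(icc_absolute_agreement(test_scores, retest_scores), 3))
```

In this toy case the ranking of patients is perfectly preserved, so the correlation is perfect, while the ICC is penalized for the systematic difference between sessions. Reporting both the correlation and a test of the mean difference, as some of the studies did, addresses part of this gap but does not yield a single agreement coefficient.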

VARIShort
The Swedish version of a short patient-reported outcome for superficial venous insufficiency (VARIShort) is a questionnaire developed by Hultman and colleagues 52 to assess HRQoL in patients with superficial venous insufficiency (SVI), considered a "short" version of the VEINES-QoL/Sym. The authors report that the VEINES-QoL/Sym is not specific for SVI, so there was a need to create an easy and comprehensive patient-reported measure. The new Swedish version consists of 7 items: 5 on symptoms, 1 on activity, and 1 on appearance. The original version of the questionnaire, in Swedish, was administered to 525 patients in CEAP classes C2 to C6, with a mean age of 58.3 years, 59% of whom were female. Therefore, the new measurement instrument covers patients who are in CEAP classes 2 or higher. The questionnaire showed adequate internal consistency (Cronbach's α=0.93) and intra-rater repeatability (ICC=0.93). In addition, the score demonstrated a strong correlation with the VEINES-QoL (r= -0.819; p<0.001). No other study or cross-cultural adaptation assessing the psychometric properties of this questionnaire was found. The quality score for the study was 5 (3 stars for selection, none for comparability, and 2 for outcomes) and it was classified as low quality.
Table 3. Characteristics of included studies that verified the reliability and validity of the AVVQ (n=5).
(+) classified as adequate according to COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN); (-) classified as not adequate according to COSMIN; (?) not verified.

Questionnaires for patients with venous leg ulcers
Two questionnaires for assessing patients with venous leg ulcers were included: the CCVUQ (five studies) and the VLU-QoL (two studies). The characteristics of these studies are shown in Table 4. The mean score for quality of the CCVUQ studies was 5.2 (ranging from 4 to 7). The mean score for quality of VLU-QoL studies was 4.5 (ranging from 4 to 5). One VLU-QoL study was of high quality and the other was of low quality.
The CCVUQ was developed by Smith et al. 55 in a London teaching hospital and surrounding community clinics. The venous ulcer-specific questionnaire consists of 21 items, divided into 4 domains related to social interaction, domestic activities, and emotional and esthetic status. The higher the score, the worse the patient's HRQoL. The questionnaire showed adequate internal consistency and was considered a valid tool for evaluation of HRQoL when compared to SF-36 domains (correlation coefficients ranged from -0.52 to -0.71). The ICC was not evaluated, but the questionnaire seems to be stable given the strong correlation (r=0.84; p<0.001) between the scores obtained at two different times, with no significant difference between the scores (p=0.86). The psychometric properties of the questionnaire were verified in Chinese, 56 Brazilian, 57,59 and Uruguayan 58 populations. All cross-cultural adaptations showed good internal consistency. Intra-rater reproducibility has also been confirmed in Brazilian 57,59 and Chinese 56 populations. In the Brazilian population, CCVUQ scores had adequate correlations with the SF-36 domains physical functioning, general health, vitality, and social functioning.
The VLU-QoL questionnaire is also designed for patients with venous leg ulcers and was developed by Hareendran et al. 53 in a multicenter study with four participating centers in the United Kingdom. The 34-item instrument evaluates HRQoL in terms of activities (12 items), psychological aspects (12 items), and symptoms related to venous leg ulcers (10 items). Higher scores represent poorer HRQoL. The original version 53 was administered to 70 patients with venous leg ulcers and all domains showed adequate internal consistency and reproducibility. Regarding validity, the mental component of the SF-36 did not show an adequate correlation with any of the three domains of the VLU-QoL and the physical component of the SF-36 only did so with the activities domain. In hypothesis testing, patients who reported exudate, edema, and ulcer smell had lower scores on the VLU-QoL, but there was no significant association with hyperpigmentation. The ICC was not reported. The Brazilian cross-cultural adaptation of the questionnaire 54 also demonstrated adequate internal consistency and good reproducibility. Table 5 contains summarized results for the reliability and validity of the CVI-specific HRQoL questionnaires included in the present study.

DISCUSSION
The present study demonstrated the reliability and validity of specific questionnaires for assessing the HRQoL of patients with CVI. Systematic discussion of psychometric properties has an important clinical impact and we believe that our findings may guide questionnaire choice and support their use in patient follow-up.
The main findings of the present review were that: (1) the CIVIQ-20 and CIVIQ-14 showed good internal consistency in all domains, except for the social domain. Reproducibility was also adequate. However, few studies have addressed their validity; (2) the VEINES-QoL also showed excellent reliability, but the findings on validity are inconclusive. Thus, to date, the CIVIQ-20 or CIVIQ-14 should be used in patients without venous ulcers; (3) the AVVQ is a questionnaire developed for patients with varicose veins and it had low to inadequate internal consistency and weak correlation with the SF-36 domains; (4) reliability and validity of the VARIShort were reported by only one study; and (5) of the questionnaires for patients with venous leg ulcers, the CCVUQ showed adequate internal consistency, reproducibility, and correlations with many domains of the SF-36. Another questionnaire, the VLU-QoL, proved to be reliable, but its validity must be investigated. Therefore, for patients with venous leg ulcers, the CCVUQ seems to be the most appropriate in terms of reliability and validity.
Table 5. Summary of results for the reliability and validity of CVI-specific HRQoL questionnaires.
CIVIQ-20: Reliability (internal consistency): α from 0.93 to 0.94. Validity (hypothesis testing): there were differences in the CIVIQ-20 global index between patients with and without arteritis (p<0.001), and in all dimensions among CEAP 0 to 4 (p<0.001).
VEINES-QoL: Reliability (internal consistency): α from 0.86 to 0.94. Validity (hypothesis testing): there were no differences in VEINES-QoL scores among CEAP classes in three studies. There were differences in VEINES-QoL scores among CEAP classes in one study.
VLU-QoL: Reliability (internal consistency): α > 0.8. Validity (hypothesis testing): patients who reported exudate, edema and ulcer smell had lower VLU-QoL scores but there was no significant association between VLU-QoL and hyperpigmentation.
The CIVIQ-20 is available in the languages of 34 countries, 60 but its reliability has only been assessed for the original version and four cross-cultural adaptations. In summary, the results suggest that the CIVIQ-20 and CIVIQ-14 have adequate internal consistency, both in the original version and its cross-cultural adaptations. Only the social dimension of CIVIQ-20 demonstrated inadequate internal consistency. The CIVIQ-14 is even more useful because it is shorter and faster to administer, which facilitates use. Another advantage is the large size of the samples in the studies reviewed that used the CIVIQ. Furthermore, a multicenter study 34 and the Dutch version 36 showed the CIVIQ-20 was accurate for discriminating disease severity. Similarly, in two cross-cultural adaptations, 38,39 the CIVIQ-14 demonstrated adequate ability to discriminate disease stages. On the other hand, reproducibility and validity in comparison with the SF-36 or other HRQoL questionnaires have been explored little. Therefore, our results suggest that the CIVIQ-20 and CIVIQ-14 are consistent and can assist in identifying the magnitude of CVI severity in these patients, but further studies are required to confirm the validity of the questionnaire for assessing the HRQoL of patients with CVI.

The VEINES-QoL is a widely used questionnaire in patients with CVI. Internal consistency and reproducibility were adequate in the studies included in the present review, even in patients with venous leg ulcers. However, there are concerns about its validity. Briefly, the VEINES-QoL did not demonstrate an adequate correlation with the mental component of the SF-36. Thus, we hypothesize that the questionnaire may cover more physical than mental aspects of CVI. Previous studies 61,62 reported social isolation and depression in patients with CVI and, therefore, the disease's impact on mental and emotional aspects should be addressed in questionnaires. Furthermore, three of the studies included 43,45,46 failed to demonstrate that the questionnaire can discriminate between CVI severity levels, which may indicate a limitation of the questionnaire for assessing HRQoL in patients at different stages of the disease. In terms of validity, no evidence was found to support use of the VEINES-QoL for assessing the HRQoL of patients with CVI. In comparison, the CIVIQ-20 and CIVIQ-14 have both shown better performance in terms of validity, despite the small number of studies available. Thus, to date, use of the CIVIQ-20 or CIVIQ-14 is recommended for assessing the quality of life of patients with CVI without venous ulcers.
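The validity comparisons with the SF-36 discussed above rest on a correlation between two sets of scores from the same patients. As a minimal sketch, assuming a Pearson coefficient (the studies reviewed may have used other coefficients, such as Spearman's), the calculation could look as follows. The function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: disease-specific questionnaire vs. an SF-36 component.
disease_specific = [42.0, 55.0, 61.0, 48.0, 70.0]
sf36_component = [38.0, 50.0, 47.0, 45.0, 58.0]
print(round(pearson_r(disease_specific, sf36_component), 3))
```

A weak coefficient against one SF-36 component, alongside stronger coefficients against others, is the pattern that leads to the hypothesis that a questionnaire covers some aspects of HRQoL better than others.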
The AVVQ questionnaire was designed for patients with varicose veins, mainly for verifying improvements in HRQoL after surgical interventions. [63][64][65] Some issues regarding the results found by the studies reviewed that used this questionnaire should be highlighted. First, the questionnaire's internal consistency is doubtful, even in the original article. 51 Adaptations in English and Dutch 47,48 presented low, although adequate, values for internal consistency, while the value for the version adapted for Brazilian Portuguese 50 was inadequate. Second, test-retest reproducibility needs to be explored further, since the ICC was only calculated for the Brazilian version. 50 Third, the questionnaire's validity was not confirmed, since the score was only weakly correlated with the SF-36 domains. Finally, all of the cross-cultural adaptations included patients at different levels of disease severity, not just patients with varicose veins, who are the questionnaire's target population. One strength of the AVVQ score was its ability to discriminate between levels of disease severity in the Dutch and Brazilian versions. 49,50 However, this finding is questionable, since the population used in the comparison was not the questionnaire's target population.
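Test-retest reproducibility is usually summarized with an intraclass correlation coefficient (ICC). The sketch below computes one common form, the two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). Which ICC form each study used is not stated here, so the choice of form is an assumption, and the function name and data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a list of subjects, each a list of k repeated scores.

    Two-way random effects, absolute agreement, single measurement.
    Assumes complete data (every subject has all k administrations).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two administrations")
    if any(len(row) != k for row in scores):
        raise ValueError("every subject needs the same number of scores")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    session_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_sessions = n * sum((m - grand_mean) ** 2 for m in session_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_sessions = ss_sessions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = (ms_subjects + (k - 1) * ms_error
                   + k * (ms_sessions - ms_error) / n)
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_subjects - ms_error) / denominator


# Hypothetical test-retest totals for five patients (first and second visit).
test_retest = [[31, 33], [45, 44], [52, 55], [38, 36], [60, 62]]
print(round(icc_2_1(test_retest), 3))
```

Because this form measures absolute agreement, a systematic shift between the first and second administration lowers the coefficient even when patients keep the same rank order.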
In patients with venous leg ulcers, the original version of the CCVUQ showed adequate internal consistency in the United Kingdom, Chinese, Brazilian, and Uruguayan populations. Reproducibility was also adequate, except for the Uruguayan version, 58 for which it was not assessed. The score of the original version showed an adequate correlation with all SF-36 domains. In the Brazilian population, 57 the correlation was inadequate only for some domains. Accuracy for discriminating between severity levels was not verified, since all patients had healed or active ulcers (CEAP C5 or C6). Therefore, the questionnaire appears to be consistent, stable, and valid for assessing the HRQoL of patients with venous leg ulcers. One limitation is that three of the five studies included had samples smaller than 100 patients, which is the recommended size for high-quality studies.
The VLU-QoL questionnaire also assesses the HRQoL of patients with venous leg ulcers and, like the CCVUQ, it proved to be consistent and adequately reproducible in the English and Brazilian Portuguese versions. 53,54 However, the validity of the questionnaire was only verified in the English version 53 and showed inadequate results for two of the three domains. Thus, according to the findings of this review, the CCVUQ is recommended for patients with venous leg ulcers, since its psychometric properties have been verified by a larger number of studies and its validity is better supported.
The present study has some strengths and limitations. One limitation is the small number of studies assessing the psychometric properties of each questionnaire. Additionally, many studies used samples that differed from the questionnaires' target populations. On the other hand, most of the studies included (66.6%) were classified as high quality in the risk of bias assessment, which strengthens confidence in the results. Additionally, the present study was not restricted to questionnaires validated for a specific country or adapted for a specific language. Therefore, the present study summarized the results for the reliability and validity of disease-specific questionnaires for assessment of the HRQoL of patients with CVI. Thus, the results should help health professionals choose a reliable, valid, and disease-specific questionnaire for patients with CVI, in addition to contributing to future research.

CONCLUSION
The present study suggests that the CIVIQ-20 and CIVIQ-14 have potential value for assessment of HRQoL in patients with CVI (Cronbach's α ranging from 0.92 to 0.94, ICC greater than 0.8), regardless of the severity of the disease, although their validity has been little reported. Additionally, the VEINES-QoL seems to be consistent (Cronbach's α ranging from 0.86 to 0.93) and reproducible (ICC greater than 0.8), but its validity is still doubtful. The AVVQ showed inadequate internal consistency (Cronbach's α ranging from 0.54 to 0.76) and validity. Among the questionnaires designed for patients with venous leg ulcers, the CCVUQ emerges as a reliable (Cronbach's α ranging from 0.83 to 0.95, ICC greater than 0.95) and valid tool. However, all of these questionnaires have important limitations and, therefore, the results must be interpreted with caution.